About

Column

About this Story

Today we are going to explore data on Educational Achievement, Income, Taxes and any corelation between those measurements across different counties of Tennessee.

Column

Where did the data come from?

Brief History of Public Education in TN

  • The state’s fisrt constitution (1796) did not mention education as an appropriation of public funds.
  • The offical recognition of public education came in 1806 within the federal Cession Act, requiring the state to furnish land for two colleges and an academy in each of the 27 existing counties within the state at the time.
  • State general assembly statues throughout 1815-1830’s proved to be ineffective due to lack of funding. County boards were independent and uncompensated.
  • 1835 state constitution included a education provision, installing a superintendent.
  • 1867 Reconstruction intiatives revive resources to be dedicated to education.
  • General Education ACt of 1909 - 25% of state gross revenue to be utilized by the counties for public schools.
  • 1954 Brown vs Board of Education and resulting desegregation in the years to come.
  • 2002 Tennessee Education Lottery ($10.8 million in sales on the first day)
  • Common Core Initiative was adopted in 2010 and implemented in 2013.
TN State Expenditure by Type 2013

TN State Expenditure by Type 2013

Storyboard

Income by County

Per Pupil Expenditure

% of Minority Students by County

Dropout Rate

Correlation Matrix of CORE Education Data

Model

Are you a Nerd? Then you may like this….

Here is an attempt to build a model based on our dataset. The model may not be perfect but the process of building the model is important here.

Call:
lm(formula = ACT_Composite ~ avg_Eng + avg_Math + avg_Sci + Dropout + 
    Pct_Suspended, data = merged_irs)

Coefficients:
  (Intercept)        avg_Eng       avg_Math        avg_Sci        Dropout  
    11.114965       0.104072       0.010566       0.020567       0.039315  
Pct_Suspended  
     0.003943  


Call:
lm(formula = ACT_Composite ~ avg_Eng + avg_Math + avg_Sci, data = merged_irs)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.2815 -0.5650 -0.1270  0.4809  1.9070 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) 12.114030   0.099123 122.212  < 2e-16 ***
avg_Eng      0.095159   0.004602  20.676  < 2e-16 ***
avg_Math     0.010742   0.001736   6.190 6.91e-10 ***
avg_Sci      0.016612   0.003327   4.994 6.29e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.7008 on 2801 degrees of freedom
Multiple R-squared:  0.6806,    Adjusted R-squared:  0.6803 
F-statistic:  1990 on 3 and 2801 DF,  p-value: < 2.2e-16

\[ACT_Composite = 12.114030 + 0.095159 * avg_Eng + 0.010742 * avg_Math + 0.016612 * avg_Sci\]

---
title: "Education, Taxes & Income"
output: 
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    social: menu
    source_code: embed
    theme: readable
---

```{r setup, include=T}
library(flexdashboard)

#```{r, message = FALSE, warning = FALSE, echo=FALSE}
library("tidyverse")
library("readxl")
library("dplyr")
library("magrittr")
library("ggplot2")
library("GGally")
library("broom")
library("plyr")
library("ggmap")
library("maps")
library("mapdata")
library("rgeos")
library("maptools")
library("sp")
library("raster")
library("rgdal")
library("sf")
library("broom")
library("stringr")

#Reading in all files for the analysis 

irs <- read.csv("data/irs_avg.csv")
edu_raw <- read.csv("data/achievement_profile_data_with_CORE.csv")
merged_irs <- read.csv("data/merged.csv")
edu_act <- read.csv("data/edu_act.csv")
total_sal_2014 <- read.csv("data/total_sal_2014.csv")
usa <- map_data("usa")
states <- map_data("state")
tn_df <- subset(states, region == "tennessee")
counties <- map_data("county")
tn_counties <- subset(counties, region == "tennessee")

```

About {data-navmenu="Explore"}
===================================== 

Column {data-width=200}
-------------------------------------

### About this Story

Today we are going to explore data on Educational Achievement, Income, Taxes and any corelation between those measurements across different counties of Tennessee.

Column {data-width=800}
-------------------------------------
### Where did the data come from? 

* Academic, demographic, and disciplinary education data gathered from [the 2014 - 2015 TN.gov](https://www.tn.gov/education/data/data-downloads.html).
* Organized zipcodes by county [Zip Codes](https://www.unitedstateszipcodes.org/zip-code-database/).
*  Aggregated tax infomration organized by zipcode [2011 - 2015 IRS Data](https://www.irs.gov/statistics/soi-tax-stats-individual-income-tax-statistics-zip-code-data-soi).

### Brief History of Public Education in TN

* The state's fisrt constitution (1796) did not mention education as an appropriation of public funds.
* The offical recognition of public education came in 1806 within the federal Cession Act, requiring the state to furnish land for two colleges and an academy in each of the 27 existing counties within the state at the time.
* State general assembly statues throughout 1815-1830's proved to be ineffective due to lack of funding.  County boards were independent and uncompensated.
* 1835 state constitution included a education provision, installing a superintendent.
* 1867 Reconstruction intiatives revive resources to be dedicated to education.
* General Education ACt of 1909 - 25% of state gross revenue to be utilized by the counties for public schools.
* 1954 Brown vs Board of Education and resulting desegregation in the years to come.
* 2002 Tennessee Education Lottery ($10.8 million in sales on the first day)
* Common Core Initiative was adopted in 2010 and implemented in 2013. 

![TN State Expenditure by Type 2013](data/Tennessee_expenditures_by_type_2013.png)

Storyboard {.storyboard data-navmenu="Explore"}
=========================================
### Income by County

```{r  echo=FALSE, fig.width=15, fig.height=15}
total_sal_2014$sal_avg_z <- round((total_sal_2014$salary_avg - mean(total_sal_2014$salary_avg,na.rm=TRUE))/sd(total_sal_2014$salary_avg,na.rm=TRUE), 2)  # compute normalized mpg
total_sal_2014$sal_type <- ifelse(total_sal_2014$sal_avg_z < 0, 'below', 'above')  # above / below avg flag
total_sal_2014 <- total_sal_2014[order(total_sal_2014$sal_avg_z), ]  # sort
total_sal_2014$county <- factor(total_sal_2014$county, levels = total_sal_2014$county)  # convert to factor to retain sorted order in plot.

# Diverging Barcharts

ggplot(total_sal_2014, aes(x=county, y=sal_avg_z, label=sal_avg_z)) +
   geom_bar(stat='identity', aes(fill=sal_type), width=.5)  +
   scale_fill_manual(name='Average Salary',
                     labels = c('Above Average', 'Below Average'),
                     values = c('above'='#00ba38', 'below'='#f8766d')) +
   labs(subtitle='Diverging Bar Plot',
        title= 'Normalised Average Salary for Counties') +
   coord_flip()
```

### Per Pupil Expenditure


```{r echo=FALSE, fig.width=15, fig.height=15}
#Per Pupil Expenditure
tn_base <- ggplot(data = tn_df, mapping = aes(x = long, y = lat, group = group)) + 
  coord_fixed(1.3) + 
  geom_polygon(color = "black", fill = "gray")

core <- read.csv("data/merged_county.csv")
colnames(core)[2] <- "subregion" #renaming to match
core[[2]] <- tolower(core[[2]]) #changing to all lowercase

#removing 'county' from the column

core$subregion <- gsub(" county$", "", core$subregion)
tn_counties$subregion <- replace(tn_counties$subregion, tn_counties$subregion=="de kalb", "dekalb")

#merging the two dataframes together with nnner_join

tn_core <- inner_join(tn_counties, core, by = "subregion")

#Attempting to plot

ditch_the_axes <- theme(
  axis.text = element_blank(),
  axis.line = element_blank(),
  axis.ticks = element_blank(),
  panel.border = element_blank(),
  panel.grid = element_blank(),
  axis.title = element_blank()
)

tn_per_pupil <- tn_base + 
  geom_polygon(data = tn_core, aes(fill = Per_Pupil_Expenditures), color = "white") +
  geom_polygon(color = "black", fill = NA) +
  theme_bw() +
  ditch_the_axes

tn_per_pupil

```

### % of Minority Students by County

```{r echo=FALSE, fig.width=25, fig.height=10}
tn_per_BHN <- tn_base + 
  geom_polygon(data = tn_core, aes(fill = Pct_BHN), color = "white") +
  geom_polygon(color = "black", fill = NA) +
  theme_bw() +
  ditch_the_axes

tn_per_BHN
```

### Dropout Rate

```{r echo=FALSE, fig.width=25, fig.height=10}
tn_dropout <- tn_base + 
  geom_polygon(data = tn_core, aes(fill = Dropout), color = "white") +
  geom_polygon(color = "black", fill = NA) +
  theme_bw() +
  ditch_the_axes

tn_dropout
```

### Correlation Matrix of CORE Education Data

```{r warning=FALSE, message=FALSE, echo=FALSE, fig.width=15, fig.height=15}

edu_2 <- dplyr::select(edu_raw, system_name,BioI,Chemistry,ELA,EngI,EngII,EngIII,AlgI,AlgII,Math,Science,ACT_Composite,Graduation,Pct_Native_American,Pct_SWD,
                Pct_Black,Pct_Hispanic,Pct_EL,Pct_ED,Pct_BHN,Pct_Chronically_Absent,
                Pct_Suspended,Pct_Expelled,Per_Pupil_Expenditures,Enrollment
)
ggcorr(edu_2)

```

Model {data-navmenu="Explore"}
===================================== 

### Are you a Nerd? Then you may like this....
Here is an attempt to build a model based on our dataset. The model may not be perfect but the process of building the model is important here.
```{r fig.width=15, fig.height=15}

lm(formula = ACT_Composite ~  avg_Eng + avg_Math + avg_Sci + Dropout + Pct_Suspended,  data = merged_irs)
          model_multi <- lm(formula = ACT_Composite ~  avg_Eng + avg_Math + avg_Sci, data = merged_irs)
          plot(model_multi)
          summary(model_multi)
```
$$ACT_Composite = 12.114030 + 0.095159 * avg_Eng + 0.010742 * avg_Math + 0.016612 * avg_Sci$$